Robust Signed-Rank Variable Selection in Linear Regression
Abstract
The growing need to handle big data calls for computationally efficient methods of identifying the important factors in statistical modeling. In the linear model, the Lasso is an effective way of selecting variables using penalized regression. It has spawned substantial research on variable selection for models that depend on a linear combination of predictors. However, little work addresses the loss of optimality in variable selection when the model errors are not Gaussian and/or when the data contain gross outliers. We propose the weighted signed-rank Lasso as a robust and efficient alternative to the least absolute deviations Lasso and the least squares Lasso. The approach is appealing for big data because data augmentation allows the estimation to be carried out as a single weighted L1 optimization problem. Selection and estimation consistency are established theoretically and evaluated via simulation studies. The results confirm the optimality of the rank-based approach for data with heavy-tailed or contaminated errors and for data containing high-leverage points.
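The data-augmentation idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the unweighted signed-rank loss, omits the intercept and any leverage-based weights, and the names `signed_rank_dispersion`, `augment`, `l1_loss`, and the penalty parameter `lam` are hypothetical. It relies on the standard identity max(|a|, |b|) = (|a + b| + |a - b|) / 2, which lets the rank-weighted loss plus the Lasso penalty be written as an ordinary sum of absolute residuals over pseudo-observations.

```python
from itertools import combinations_with_replacement


def residuals(X, y, beta):
    """Residuals y_i - x_i' beta for a model without intercept."""
    return [yi - sum(x * b for x, b in zip(xi, beta)) for xi, yi in zip(X, y)]


def signed_rank_dispersion(res):
    """Sum of rank(|e_i|) * |e_i|, with ranks 1..n of the absolute residuals.

    Tied magnitudes contribute the same total under any tie-breaking rule.
    """
    abs_sorted = sorted(abs(e) for e in res)
    return sum(rank * a for rank, a in enumerate(abs_sorted, start=1))


def augment(X, y, lam=0.0):
    """Build pseudo-observations whose plain L1 loss equals
    signed-rank dispersion + lam * sum(|beta_k|).

    Illustrative assumption: unweighted loss, no intercept.
    """
    n, p = len(y), len(X[0])
    rows, resp = [], []
    for i, j in combinations_with_replacement(range(n), 2):
        # Pairwise averages: residual is (e_i + e_j) / 2.
        rows.append([0.5 * (X[i][k] + X[j][k]) for k in range(p)])
        resp.append(0.5 * (y[i] + y[j]))
        if i != j:
            # Pairwise half-differences: residual is (e_i - e_j) / 2.
            rows.append([0.5 * (X[i][k] - X[j][k]) for k in range(p)])
            resp.append(0.5 * (y[i] - y[j]))
    for k in range(p):
        # Penalty rows: response 0, predictor lam * unit vector,
        # so the absolute residual is lam * |beta_k|.
        row = [0.0] * p
        row[k] = lam
        rows.append(row)
        resp.append(0.0)
    return rows, resp


def l1_loss(X, y, beta):
    """Plain least-absolute-deviations loss."""
    return sum(abs(e) for e in residuals(X, y, beta))


# Small made-up data set, used only to compare the two forms of the objective.
X = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0], [2.0, 2.0]]
y = [1.0, -2.0, 0.5, 4.0]
beta = [0.5, -1.0]
lam = 0.7

X_aug, y_aug = augment(X, y, lam)
direct = signed_rank_dispersion(residuals(X, y, beta)) + lam * sum(abs(b) for b in beta)
via_l1 = l1_loss(X_aug, y_aug, beta)
print(direct, via_l1)
```

Because the two objectives agree for every `beta`, any least-absolute-deviations solver applied to the augmented data would minimize the penalized signed-rank criterion. The augmented set has n^2 + p rows, so this naive construction is only practical for moderate n.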
Similar resources
Statistical approach for selection of regression model during validation of bioanalytical method
The selection of an adequate regression model is the basis for obtaining accurate and reproducible results during bioanalytical method validation. Given the wide concentration range frequently present in bioanalytical assays, heteroscedasticity of the data may be expected. Several weighted linear and quadratic regression models were evaluated during the selection of the adequate curve fit u...
Rank-based variable selection
This note considers variable selection in the robust linear model via R-estimates. The proposed rank-based approach is a generalization of the penalized least squares estimators where we replace the least squares loss function with Jaeckel’s (1972) dispersion function. Our rank-based method is robust to outliers in the errors and has roots in traditional nonparametric statistics for simple loca...
Sparse Reduced-Rank Regression for Simultaneous Dimension Reduction and Variable Selection in Multivariate Regression
The reduced-rank regression is an effective method to predict multiple response variables from the same set of predictor variables, because it can reduce the number of model parameters as well as take advantage of interrelations between the response variables and therefore improve predictive accuracy. We propose to add a new feature to the reduced-rank regression that allows selection of releva...
Robust nonnegative garrote variable selection in linear regression
Robust selection of variables in a linear regression model is investigated. Many variable selection methods are available, but very few methods are designed to avoid sensitivity to vertical outliers as well as to leverage points. The nonnegative garrote method is a powerful variable selection method, developed originally for linear regression but recently successfully extended to more complex reg...
Nonparametric signed-rank Shewhart control chart with variable sampling interval
A nonparametric control chart based on ranks is used for detecting changes in the median (mean). In this article, the signed-rank control chart is considered with a variable sampling interval. We compared the performance of the signed-rank chart with variable sampling interval (VSI-SR) to that of the signed-rank chart with fixed sampling interval (FSI-SR); the numerical results demonstrated that the VSI feature is useful. Bakir [1] showed ...
Publication date: 2017